Everything is Connected: Graph Neural Networks
In many ways, graphs are the main modality of data we receive from nature.
This is due to the fact that most of the patterns we see, both in natural and
artificial systems, are elegantly representable using the language of graph
structures. Prominent examples include molecules (represented as graphs of
atoms and bonds), social networks and transportation networks. This potential
has already been seen by key scientific and industrial groups, with
already-impacted application areas including traffic forecasting, drug
discovery, social network analysis and recommender systems. Further, some of
the most successful domains of application for machine learning in previous
years -- images, text and speech processing -- can be seen as special cases of
graph representation learning, and consequently there has been significant
exchange of information between these areas. The main aim of this short survey
is to enable the reader to assimilate the key concepts in the area, and
position graph representation learning in a proper context with related fields.
Comment: To appear in Current Opinion in Structural Biology. 14 pages, 1 figure.
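The central concept underlying graph representation learning is message passing, in which each node repeatedly aggregates features from its neighbours and updates its own representation. As a rough, self-contained illustration (the toy graph, weight names and sum aggregation below are assumptions chosen for exposition, not code from the survey):

```python
# Minimal sketch of one message-passing step over a small graph.
# All sizes and variable names are illustrative assumptions.
import numpy as np

def message_passing_step(X, A, W_self, W_neigh):
    """One GNN layer: every node sums its neighbours' transformed features
    and combines them with a transformation of its own features."""
    messages = A @ X @ W_neigh        # aggregate transformed neighbour features
    updated = X @ W_self + messages   # combine with the node's own features
    return np.maximum(updated, 0.0)   # ReLU non-linearity

# Toy example: a 4-node path graph with 8-dimensional node features.
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)   # adjacency matrix
X = rng.normal(size=(4, 8))                 # node feature matrix
W_self = rng.normal(size=(8, 8))
W_neigh = rng.normal(size=(8, 8))

H = message_passing_step(X, A, W_self, W_neigh)
print(H.shape)  # (4, 8): a new representation for every node
```

Stacking several such steps lets information propagate along longer paths in the graph, which is one way to see why grid-structured data such as images and text appears as a special case: pixels and tokens are simply nodes with a regular, fixed neighbourhood.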
X-CNN: Cross-modal Convolutional Neural Networks for Sparse Datasets
In this paper we propose cross-modal convolutional neural networks (X-CNNs),
a novel, biologically inspired type of CNN architecture that treats gradient
descent-specialised CNNs as individual units of processing in a larger-scale
network topology, while allowing for unconstrained information flow and/or
weight sharing between analogous hidden layers of the network---thus
generalising the already well-established concept of neural network ensembles
(where information typically may flow only between the output layers of the
individual networks). The constituent networks are individually designed to
learn the output function on their own subset of the input data; cross-connections
between them are then introduced after each pooling operation to periodically
allow for information exchange. This injection of
knowledge into a model (by prior partition of the input data through domain
knowledge or unsupervised methods) is expected to yield greatest returns in
sparse data environments, which are typically less suitable for training CNNs.
For evaluation purposes, we have compared a standard four-layer CNN as well as
a sophisticated FitNet4 architecture against their cross-modal variants on the
CIFAR-10 and CIFAR-100 datasets with differing percentages of the training data
being removed, and find that at lower levels of data availability, the X-CNNs
significantly outperform their baselines (typically providing a 2--6% benefit,
depending on the dataset size and whether data augmentation is used), while
still maintaining an edge on all of the full dataset tests.
Comment: To appear in the 7th IEEE Symposium Series on Computational Intelligence (IEEE SSCI 2016), 8 pages, 6 figures. Minor revisions, in response to reviewers' comments.
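As a rough sketch of the cross-connection mechanism described above (assuming, purely for illustration, a two-stream partition of 32x32 images into one luminance and two chrominance planes; the layer sizes, names and 1x1-convolution cross-connections are illustrative choices rather than the exact X-CNN configuration from the paper):

```python
# Illustrative two-stream cross-modal CNN sketch in PyTorch; not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoStreamXCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        # Each constituent network sees only its own subset of the input channels.
        self.conv_a1 = nn.Conv2d(1, 16, 3, padding=1)   # stream A: e.g. luminance plane
        self.conv_b1 = nn.Conv2d(2, 16, 3, padding=1)   # stream B: e.g. chrominance planes
        # Cross-connections: 1x1 convolutions injecting one stream's pooled
        # feature maps into the analogous hidden layer of the other stream.
        self.cross_b_to_a = nn.Conv2d(16, 16, 1)
        self.cross_a_to_b = nn.Conv2d(16, 16, 1)
        self.conv_a2 = nn.Conv2d(32, 32, 3, padding=1)  # 16 own + 16 cross-connected channels
        self.conv_b2 = nn.Conv2d(32, 32, 3, padding=1)
        self.fc = nn.Linear(64 * 8 * 8, num_classes)    # 32x32 input -> 8x8 after two poolings

    def forward(self, x_a, x_b):
        a = F.max_pool2d(F.relu(self.conv_a1(x_a)), 2)
        b = F.max_pool2d(F.relu(self.conv_b1(x_b)), 2)
        # Information exchange after the pooling operation.
        a_cross = F.relu(self.cross_b_to_a(b))  # features flowing B -> A
        b_cross = F.relu(self.cross_a_to_b(a))  # features flowing A -> B
        a = torch.cat([a, a_cross], dim=1)
        b = torch.cat([b, b_cross], dim=1)
        a = F.max_pool2d(F.relu(self.conv_a2(a)), 2)
        b = F.max_pool2d(F.relu(self.conv_b2(b)), 2)
        return self.fc(torch.cat([a, b], dim=1).flatten(1))

model = TwoStreamXCNN()
y_plane = torch.randn(4, 1, 32, 32)     # hypothetical luminance input
uv_planes = torch.randn(4, 2, 32, 32)   # hypothetical chrominance input
print(model(y_plane, uv_planes).shape)  # torch.Size([4, 10])
```

In data-sparse regimes, each stream has fewer parameters to fit than a single monolithic CNN over all channels, while the cross-connections preserve the ability to combine information across modalities, which matches the abstract's reasoning about where the greatest returns are expected.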
The resurgence of structure in deep neural networks
Machine learning with deep neural networks ("deep learning") allows for learning complex features directly from raw input data, completely eliminating hand-crafted, "hard-coded" feature extraction from the learning pipeline. This has led to state-of-the-art performance being achieved across several---previously disconnected---problem domains, including computer vision, natural language processing, reinforcement learning and generative modelling. These success stories nearly universally go hand-in-hand with the availability of immense quantities of labelled training examples ("big data") exhibiting simple grid-like structure (e.g. text or images), exploitable through convolutional or recurrent layers. This is due to the extremely large number of degrees of freedom in neural networks, leaving their generalisation ability vulnerable to effects such as overfitting.
However, there remain many domains where extensive data gathering is not always appropriate, affordable, or even feasible. Furthermore, data is generally organised in more complicated kinds of structure---which most existing approaches would simply discard. Examples of such tasks are abundant in the biomedical space, with e.g. small numbers of subjects available for any given clinical study, or relationships between proteins specified via interaction networks. I hypothesise that, if deep learning is to reach its full potential in such environments, we need to reconsider "hard-coded" approaches---integrating assumptions about inherent structure in the input data directly into our architectures and learning algorithms, through structural inductive biases. In this dissertation, I directly validate this hypothesis by developing three structure-infused neural network architectures (operating on sparse multimodal and graph-structured data), and a structure-informed learning algorithm for graph neural networks, demonstrating significant outperformance of conventional baseline models and algorithms.
The work depicted in this dissertation was in part supported by funding from the European Union's Horizon 2020 research and innovation programme PROPAG-AGEING under grant agreement No 634821.
Parallel Algorithms Align with Neural Execution
Neural algorithmic reasoners are parallel processors. Teaching them
sequential algorithms contradicts this nature, rendering a significant share of
their computations redundant. Parallel algorithms, however, may exploit their
full computational power, therefore requiring fewer layers to be executed. This
drastically reduces training times, as we observe when comparing parallel
implementations of searching, sorting and finding strongly connected components
to their sequential counterparts on the CLRS framework. Additionally, parallel
versions achieve strongly superior predictive performance in most cases.
Comment: 8 pages, 5 figures. To appear at the KLR Workshop at ICML 2023.
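A toy illustration of the layer-count argument follows (this is neither the paper's code nor the CLRS benchmark; the minimum-finding task and the round counting are assumptions chosen only to contrast sequential and parallel execution):

```python
# Illustrative sketch: a parallel reduction reaches its answer in O(log n)
# synchronous rounds, while a sequential scan needs O(n) steps. Each round is
# the analogue of one layer a parallel neural reasoner would have to execute.
def sequential_min_rounds(values):
    """Sequential scan: one element is incorporated per step."""
    best, rounds = values[0], 0
    for v in values[1:]:
        best = min(best, v)
        rounds += 1
    return best, rounds

def parallel_min_rounds(values):
    """Tournament-style reduction: all pairwise comparisons in a round
    happen simultaneously, halving the problem size each time."""
    rounds = 0
    while len(values) > 1:
        values = [min(values[i], values[i + 1]) if i + 1 < len(values) else values[i]
                  for i in range(0, len(values), 2)]
        rounds += 1
    return values[0], rounds

data = list(range(64, 0, -1))
print(sequential_min_rounds(data))  # (1, 63): 63 sequential steps
print(parallel_min_rounds(data))    # (1, 6):  log2(64) parallel rounds
```

Since a graph-network-based reasoner updates every node in parallel at each layer, the parallel round count is the natural analogue of the number of layers it must execute, which is why parallel algorithmic targets can be learned with far fewer layers than sequential ones.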